NoL – Intro to R

Mapping biodiversity and visualization!
Author
Affiliation
Published

June 12, 2024

Today we will learn basic tools in R for visualizing species distributions.

You will need three datasets, that will be provided for you:

  1. Species occurrence data points – live.oaks.csv

  2. Species geograhical ranges – furnariides_sample.shp

  3. Environmental variables – bio1.bil and bio12.bil

1 Set up your data and your working directory

Set up a working directory and put the data files in that directory. Tell R that this is the directory you will be using, and read in your data:

Code
setwd("your path name")

To do this laboratory you will need to have a set of R packages. Install the following packages:

Code
packages <- c("tidyverse", "sf", "scico", "rnaturalearth", 
              "rnaturalearthdata", "viridis", 
              "terra", "rasterVis", "RColorBrewer") 
# Package vector names
Function install.packages()

You can use the function install.packages() to install the packages.

If you don’t want to install the packages one by one, you can use the next command.

Code
# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())

if (any(installed_packages == FALSE)) {
  install.packages(packages[!installed_packages], dependencies = TRUE)
}

This command, will, first, check if you already the packages installed, then if a package is not installed in your computer, will install it.

When installing multiple packages, please pay attention to the messages in your R console. In some cases, R will ask you if you want to install the source version. If that is the case, just type n and hit enter.

Load installed packages:

Code
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Code
library(scico)
library(rnaturalearth)
library(terra)
terra 1.7.71

Attaching package: 'terra'

The following object is masked from 'package:tidyr':

    extract
Code
library(sf)
Linking to GEOS 3.12.1, GDAL 3.8.5, PROJ 9.4.0; sf_use_s2() is TRUE

Double-check your working directory.

Function getwd()

You can use the function getwd() to get the current working directory.

2 From point occurrences to range maps

We will start setting the geographical extent of our study area and to do that we will use spatial data from the package {rnaturalearth}.

Code
sf_use_s2(FALSE)
Spherical geometry (s2) switched off
Code
# Get world map from the {rnaturalearthdata} package
worldMap <- ne_countries(scale = "medium", 
                         type = "countries", 
                         returnclass = "sf")

# cCountry subset - The United States and Mexico
NApoly <- worldMap %>% 
  filter(admin == "United States of America" | admin == "Mexico")

# trim to study area
limsNA <- st_buffer(NApoly, dist = 1) %>% 
  st_bbox() 
Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
dist is assumed to be in decimal degrees (arc_degrees).
Code
# neighboring countries
adjacentPolys <- st_touches(NApoly, worldMap)
although coordinates are longitude/latitude, st_touches assumes that they are
planar
Code
neighbours <- worldMap %>% 
  slice(pluck(adjacentPolys, 1))

We can plot the resulting map.

Code
ggplot() +
  geom_sf(data = neighbours, color = "white") +
  geom_sf(data = NApoly) +
  coord_sf(
    xlim = c(limsNA["xmin"], limsNA["xmax"]),
    ylim = c(limsNA["ymin"], limsNA["ymax"])
  ) +
  theme(
    plot.background = element_rect(fill = "#f1f2f3"),
    panel.background = element_rect(fill = "lightblue"),
    panel.grid = element_blank(),
    line = element_blank(),
    rect = element_blank()
  )

Hmmmm, that is somewhat ugly, let’s adjust the coordinates a bit the map and plot it again…

Code
ggplot() +
  geom_sf(data = neighbours, color = "white") +
  geom_sf(data = NApoly) +
  coord_sf(
     xlim = c(-125, -65),
    ylim = c(10, 50)
  ) +
  theme(
    plot.background = element_rect(fill = "#f1f2f3"),
    panel.background = element_rect(fill = "lightblue"),
    panel.grid = element_blank(),
    line = element_blank(),
    rect = element_blank()
  )

Much, much better!

Load species occurrences data points. We will use occurrences from Live oaks, that were obtained from iDigBio between 20 and 24 July 2018 by Jeannine Cavender-Bares. Notice that these occurrence data points were visually examined and any localities that were outside the known range of the species, or in unrealistic locations (e.g., water bodies, crop fields) were discarded.

Code
oaks_occ <- read_delim("data/live.oaks.csv") %>% 
  filter(Species != "Hybrid") # removing hybrid observations
Rows: 672 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Species
dbl (2): Longitude, Latitude

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
oaks_occ %>% 
  count(Species) # check how many species and how many observations per species
# A tibble: 7 × 2
  Species                n
  <chr>              <int>
1 Quercus_brandegei     35
2 Quercus_fusiformis    76
3 Quercus_geminata      51
4 Quercus_minima        38
5 Quercus_oleoides     292
6 Quercus_sagraena      40
7 Quercus_virginiana   134
Code
# Visualize

oaks_occ %>% 
  count(Species) %>% 
  ggplot(aes(x = Species, y = n)) + 
  geom_bar(stat = "identity") + 
  coord_flip() + 
  ylab("Number of observations") + 
  theme_bw()

Transform data.frame to spatial data.frame

Code
# to sf object, specifying variables with coordinates and projection
oaks_occ_sf <- st_as_sf(oaks_occ, # data.frame object
                        coords = c("Longitude", "Latitude"), # coordinates names
                        crs = 4326) %>% # coordinates system
  #group_by(species) %>%
  st_cast("MULTIPOINT") %>% 
  group_by(Species) %>% 
  summarize()
although coordinates are longitude/latitude, st_union assumes that they are
planar
Code
glimpse(oaks_occ_sf)
Rows: 7
Columns: 2
$ Species  <chr> "Quercus_brandegei", "Quercus_fusiformis", "Quercus_geminata"…
$ geometry <MULTIPOINT [°]> MULTIPOINT ((-110.1556 23.7..., MULTIPOINT ((-100.…

What variables we have in the oaks_occ object?

Answer: There are three variables, species, longitude and latitude.

How many oak species are in the dataset?

Answer: There are seven oak species.

Code
unique(oaks_occ$Species)
[1] "Quercus_virginiana" "Quercus_geminata"   "Quercus_minima"    
[4] "Quercus_fusiformis" "Quercus_oleoides"   "Quercus_brandegei" 
[7] "Quercus_sagraena"  

What we did in the previous code was simply to transform a the data.frame object into a spatial data.frame. We can plot the results.

Code
ggplot() +
  geom_sf(data = neighbours, color = "white") +
  geom_sf(data = NApoly) + 
  geom_sf(data = oaks_occ_sf, aes(color = Species), alpha = 0.7) + 
  coord_sf(
     xlim = c(-125, -65),
    ylim = c(10, 50)
  ) +
  theme(
    plot.background = element_rect(fill = "#f1f2f3"),
    panel.background = element_rect(fill = "lightblue"),
    panel.grid = element_blank(),
    line = element_blank(),
    rect = element_blank()
  )

Nice!

2.1 Range maps from point data

In this section we will learn how to create “simple” range maps based on geometry (e.g. minimum convex polygons, etc.), without considering environmental variables (e.g., ENMs or SDMs). Note that these range maps are geographical abstractions of the species ranges. In other words, a species range is the area where a particular species can be found during its lifetime. Species range includes areas where individuals or communities can migrate or hibernate

We will explore two alternative, one based on simple convex hull and the other is the smoothed convex hull

2.1.1 Convex hull

Code
# Observations to convex hull
oaks_CH <- st_convex_hull(oaks_occ_sf) 

# plot hulls
ggplot() +
  geom_sf(data = neighbours, color = "white") +
  geom_sf(data = NApoly) +
  geom_sf(data = oaks_CH, aes(fill = Species), alpha = 0.7) +
  scale_fill_scico_d(palette = "davos", direction = -1, end = 0.9) +
  coord_sf(
    xlim = c(-125, -65),
    ylim = c(10, 50)
  ) +
  theme(
    plot.background = element_rect(fill = "#f1f2f3"),
    panel.background = element_rect(fill = "lightblue"),
    panel.grid = element_blank(),
    line = element_blank(),
    rect = element_blank()
  )

Until here we have explored how to plot, clean and build species geographical ranges using occurrences. Now we will use a sample of species geographical ranges of the largest continental endemic radiation (Furnariides) to explore the geographical gradients of species diversity.

3 Diversity gradients

3.1 Prepare data and mapping

The geographical ranges correspond to the Infraorder Furnariides (Aves). This data is available thorough BirdLife International and you can use any other group available on IUCN or BIEN (for plants in the Americas). In any case, you first need to download the polygons in shapefile format.

To load the Furnariides geographical ranges, we will use the function st_read() from the package {sf}. Please read the message printed on your console and try to understand the data.

Code
franges <- st_read("data/furnariides_sample.shp") 
Reading layer `furnariides_sample' from data source 
  `/Users/jpintole/Library/CloudStorage/Dropbox/Teaching/NOL_Itasca/NoL-2024/data/furnariides_sample.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 217 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -110.0817 ymin: -55.98222 xmax: -34.79012 ymax: 28.50845
Geodetic CRS:  WGS 84

Explore the imported data.

Code
class(franges)
[1] "sf"         "data.frame"

Now see all the data information.

Code
glimpse(franges)
Rows: 217
Columns: 16
$ SCINAME    <chr> "Automolus infuscatus", "Thamnophilus nigriceps", "Formiciv…
$ Perimeter  <dbl> 191.602729, 34.725692, 143.592812, 36.615112, 56.535598, 18…
$ Area       <dbl> 489.178106, 17.264407, 182.345788, 41.179272, 13.231539, 28…
$ AreaKM2    <dbl> 4891781.06, 172644.07, 1823457.88, 411792.72, 132315.39, 28…
$ RD         <dbl> 17, 16, 13, 14, 15, 13, 20, 26, 14, 10, 15, 15, 20, 9, 21, …
$ DR         <dbl> 0.29114, 0.21878, 0.21958, 0.17335, 0.10925, 0.12006, 0.488…
$ BodyMass   <dbl> 32.90, 22.90, 10.00, 12.29, 12.40, 14.10, 36.00, 18.30, 10.…
$ DietInv    <dbl> 90, 100, 100, 100, 100, 100, 100, 90, 100, 50, 100, 100, 10…
$ DietFruit  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ DietSeed   <dbl> 0, 0, 0, 0, 0, 0, 0, 10, 0, 50, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ StratGroun <dbl> 0, 20, 20, 30, 30, 50, 0, 50, 0, 100, 0, 0, 50, 100, 0, 90,…
$ Understory <dbl> 80, 60, 40, 40, 70, 50, 50, 50, 50, 0, 0, 0, 50, 0, 60, 10,…
$ Midhigh    <dbl> 20, 20, 40, 30, 0, 0, 50, 0, 50, 0, 50, 50, 0, 0, 40, 0, 0,…
$ Canopy     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 50, 50, 0, 0, 0, 0, 0, 0, 0, …
$ Ages       <dbl> 1.339967, 2.552121, 5.693174, 1.705715, 6.082210, 6.714462,…
$ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-78.33172 -..., MULTIPOLYGON (…

What variables are present in the spatial polygon object? and How many species?

Answer: There are 217 species in this spatial object. The variables included are: Perimeter, Area, AreaKM2, RD, DR, BodyMass, DietInv, DietFruit, DietSeed, StratGroun, Understory, Midhigh, Canopy, Ages.

Code
length(unique(franges$SCINAME))
[1] 217
Code
frangesVars <- names(franges)[2:15]

frangesVars
 [1] "Perimeter"  "Area"       "AreaKM2"    "RD"         "DR"        
 [6] "BodyMass"   "DietInv"    "DietFruit"  "DietSeed"   "StratGroun"
[11] "Understory" "Midhigh"    "Canopy"     "Ages"      

Let’s plot a couple of species.

Code
selSPP <- franges %>% 
  filter(SCINAME == "Batara cinerea" | SCINAME == "Asthenes modesta")

# country subset
SApoly <- worldMap %>% 
  filter(continent == "South America")
Code
# plot the selected ranges
ggplot() +
  geom_sf(data = SApoly, alpha = 0.5) +
  geom_sf(data = selSPP, aes(fill = SCINAME), alpha = 0.7, size = 2) +
  scale_fill_scico_d(palette = "davos", direction = 1, 
                     end = 0.9) +
  coord_sf(
    xlim = c(-80, -35),
    ylim = c(10, -60)
  ) + 
  theme_classic() + 
  theme(
    plot.background = element_rect(fill = "#f1f2f3"),
    panel.background = element_rect(fill = "lightblue"),
    panel.grid = element_blank(),
    line = element_blank(),
    rect = element_blank()
  )

Explain the distribution for both species (i.e., Asthenes modesta [blue polygon] and Batara cinerea [yellow polygon]) Are these species distributed in sympatry or allopatry? Explain the selected distribution pattern.

Answer: A. modesta is widely distributed mostly in the highlands of South America (Argentina, Bolivia, Chile and Perú). B. cinerea has a more restricted and disjunct distribution mostly in the lowlands of South America (Argentina, Bolivia, Brazil and Paraguay). These two species present an allopatric distribution

3.2 Raster of species richness

Species richness is the number of different species represented in an ecological community, landscape or region. Species richness is simply a count of species, and it does not take into account the abundances of the species or their relative abundance distributions.

Now, let’s create a map that represent the species richness of Furnariides.

First create an empty raster for the Neotropics using the extent of the Furnariides ranges under a spatial resolution of 1º long-lat or 111 km at the equator.

Code
# Create raster ro store richness values
neo_ras <- rast() # empty raster

ext(neo_ras) <- ext(franges) # Set the raster "extent" 

res(neo_ras) <- 1 # Set the raster "resolution" 

neo_ras # print the raster object in the console
class       : SpatRaster 
dimensions  : 84, 75, 1  (nrow, ncol, nlyr)
resolution  : 1, 1  (x, y)
extent      : -110.0817, -35.08174, -55.98222, 28.01778  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
Code
values(neo_ras) <- 0 # assign O values to all pixels in the raster

Plot empty raster for the Neotropics

Code
# transform the sf object to a sp object
SApoly_sp <- as(SApoly, "Spatial") 

# Plot empty raster
plot(neo_ras)

plot(SApoly_sp, add = TRUE) ## overlay SA countries to the SR map

Now using the empty raster we will rasterize the species identities in each cell or pixel. The resulting raster will be the species richness of Furnariides across the Neotropics.

Code
f_sr_raster <- terra::rasterizeGeom(x = vect(franges), # species geographical ranges
                         y = neo_ras, # empty raster
                         fun = "count" # count number of species per grid cell
                         )
# this process can take a while depending on your computer (~15 secs in Jesús's computer), please be patient.

Plot the resulting raster.

Code
plot(f_sr_raster)

Let’s try changing the colors using the package {viridis}

Code
# country subset
Apoly <- worldMap %>% 
  filter(continent == "South America" | continent == "North America")

# transform the sf object to a sp object
Apoly_sp <- as(Apoly, "Spatial") 

# Plot the information
plot(mask(f_sr_raster, Apoly), # raster of species richness
     col = viridis::turbo(10), # colors
     zlim = c(min(values(f_sr_raster), 
                  max(values(f_sr_raster)))), 
     main = "Furnariides species richness") 

plot(Apoly_sp, add = TRUE) ## overlay SA countries to the SR map

Or we can try a more fancy way to plot the number of Furnariids’ species. To do that we can use the package {rasterVis} for plotting and the package {RColorBrewer} for selecting color combinations.

Code
library(rasterVis)
Loading required package: lattice
Code
library(RColorBrewer)

# First set a theme
mapTheme <- rasterTheme(region = rev(brewer.pal(11, "Spectral")),
  layout.widths = list(right.padding = 10),
  axis.line = list(col = "transparent"),
  tick = list(col = 'transparent'))

## Now we can plot the raster
p_furna_SR <- levelplot(mask(f_sr_raster, Apoly),
  maxpixels = 1e10,
  margin = FALSE, 
  main = list('Furnariides \n species richness', col = 'darkgray'), 
  par.settings = mapTheme,
  scales = list(x = list(draw = TRUE),
                y = list(draw = TRUE)),
  zlim = c(0, 180)) 

p_furna_SR

Awesome, right?. Now, please, describe the observed pattern!

Answer: The map shows the classic latitudinal diversity gradient, with a higher accumulation of species close to the equator and a continued decline in the number of species as we approach the South Pole.

Save the figures

Code
pdf(file = "figures/Figure_1_SR.pdf", 
    width = 8, height = 10)

p_furna_SR

dev.off()
quartz_off_screen 
                2 

3.3 Correlative relationships - species richness as a function of evolutionary history

Let’s try to rasterize another information from the polygon data set. We will use the information in the column RD, this data correspond to the numbers of nodes from the tips to the root of a phylogenetic tree or just root distance, thus, will use the RD to calculate the MRD metric (mean root distance) that measures the evolutionary derivedness of species within an assemblage (kerr_relative_1999?) and can be used to determine whether a local fauna is constituted primarily by early-diverged or by recently originated species (hawkins_different_2012?; pinto-ledezma_geographical_2017?). In other words, high MRD values means that the community (i.e., grid-cell) is composed mostly by recently originated species, whereas low MRD values by early-diverged species.

Code
franges
Simple feature collection with 217 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -110.0817 ymin: -55.98222 xmax: -34.79012 ymax: 28.50845
Geodetic CRS:  WGS 84
First 10 features:
                     SCINAME  Perimeter       Area    AreaKM2 RD      DR
1       Automolus infuscatus 191.602729 489.178106 4891781.06 17 0.29114
2     Thamnophilus nigriceps  34.725692  17.264407  172644.07 16 0.21878
3     Formicivora intermedia 143.592812 182.345788 1823457.88 13 0.21958
4      Hypocnemis flavescens  36.615112  41.179272  411792.72 14 0.17335
5         Hellmayrea gularis  56.535598  13.231539  132315.39 15 0.10925
6  Hypocnemoides melanopogon 187.260741 284.324993 2843249.93 13 0.12006
7       Syndactyla guttulata   7.685427   1.023303   10233.03 20 0.48874
8       Synallaxis brachyura 132.134056  33.176726  331767.26 26 0.39620
9  Epinecrophylla spodionota  35.361844   5.711455   57114.55 14 0.22306
10         Geositta punensis  29.576323  33.542530  335425.30 10 0.16401
   BodyMass DietInv DietFruit DietSeed StratGroun Understory Midhigh Canopy
1     32.90      90         0        0          0         80      20      0
2     22.90     100         0        0         20         60      20      0
3     10.00     100         0        0         20         40      40      0
4     12.29     100         0        0         30         40      30      0
5     12.40     100         0        0         30         70       0      0
6     14.10     100         0        0         50         50       0      0
7     36.00     100         0        0          0         50      50      0
8     18.30      90         0       10         50         50       0      0
9     10.70     100         0        0          0         50      50      0
10    25.80      50         0       50        100          0       0      0
       Ages                       geometry
1  1.339967 MULTIPOLYGON (((-78.33172 -...
2  2.552121 MULTIPOLYGON (((-78.66335 9...
3  5.693174 MULTIPOLYGON (((-50.62085 -...
4  1.705715 MULTIPOLYGON (((-73.02071 0...
5  6.082210 MULTIPOLYGON (((-75.06168 -...
6  6.714462 MULTIPOLYGON (((-49.38612 -...
7  0.751854 MULTIPOLYGON (((-63.92406 9...
8  1.886419 MULTIPOLYGON (((-80.44446 -...
9  3.687198 MULTIPOLYGON (((-70.86371 -...
10 1.643349 MULTIPOLYGON (((-69.96078 -...

Rasterize the species’ Root distance to create a map of Mean Root Distance.

Code
f_MRD_raster <- terra::rasterize(vect(franges), # polygons
                          neo_ras, # empty raster
                          field = "RD", # Root distance
                          fun = mean # function mean
                          )

Plot the results

Code
plot(f_MRD_raster, 
     main = 'Furnariides mean root distance')

plot(Apoly_sp, add = TRUE)

Let’s try changing the colors.

Code
## Now we can plot the raster
p_furna_MRD <- levelplot(f_MRD_raster,
  maxpixels = 1e10,
  margin = FALSE, 
  main = list('Furnariides \n mean root distance', col = 'darkgray'), 
  par.settings = mapTheme,
  scales = list(x = list(draw = TRUE),
                y = list(draw = TRUE)),
  zlim = c(0, 25))

p_furna_MRD

Based on the description provided above, please describe the MRD pattern

Answer: The map of mean root distance (MRD) shows an inverse pattern when compared with the patterns of species richness. Specifically, areas/pixels with a high number of species present low MRD, and areas/pixels with a low number of species have high MRD values. This might suggest that areas with low number of co-occurring species are mostly composed by recently-evolved species, conversely, areas with high number of species by species that diverged early in the evolutionary history of the clade.

Let’s plot both raster.

Code
par(mfrow = c(1, 2))

plot(mask(f_sr_raster, Apoly), 
     col = viridis::plasma(10), 
     main = "Furnariides species richness")

plot(mask(f_MRD_raster, Apoly), 
     col = viridis::plasma(10), 
     main = "Furnariides mean root distance")

Code
#dev.off()

Check if there is a relationship between the species richness and the evolutionary derivedness.

Code
cor.test(values(f_sr_raster), values(f_MRD_raster))

    Pearson's product-moment correlation

data:  values(f_sr_raster) and values(f_MRD_raster)
t = -26.141, df = 1601, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.580359 -0.511682
sample estimates:
     cor 
-0.54694 

Or as in the previous lab, we can create a model that explain the association.

Code
obj <- lm(values(f_sr_raster) ~ values(f_MRD_raster))

summary(obj)

Call:
lm(formula = values(f_sr_raster) ~ values(f_MRD_raster))

Residuals:
    Min      1Q  Median      3Q     Max 
-29.040  -5.547  -1.025   3.851  43.733 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)          51.84281    1.34533   38.53   <2e-16 ***
values(f_MRD_raster) -2.42258    0.09267  -26.14   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.932 on 1601 degrees of freedom
  (4697 observations deleted due to missingness)
Multiple R-squared:  0.2991,    Adjusted R-squared:  0.2987 
F-statistic: 683.3 on 1 and 1601 DF,  p-value: < 2.2e-16
Code
# get pixel values from raster richness
data_sr <- as.data.frame(f_sr_raster, xy = TRUE) 

# get pixel values from raster MRD
data_mrd <- as.data.frame(f_MRD_raster, xy = TRUE)

# Combine both datasets
data_sr_mrd <- left_join(data_sr, data_mrd, by = c("x", "y")) %>% 
  rename(SR = area, MRD = RD) %>% 
  drop_na(MRD)

# Plot the association
data_sr_mrd %>% 
  ggplot(aes(x = MRD, y = SR)) + 
  geom_point(color = "darkgray", size = 3, alpha = 0.5) + 
  geom_smooth(method = "lm") + 
  labs(x = "Mean Root Distance", 
       y = "Species Richness") + 
  theme_classic() + 
  theme(
    axis.text = element_text(size = 15, colour = "black"),
    axis.title = element_text(size = 18) 
    )
`geom_smooth()` using formula = 'y ~ x'

Hmmm. What happened in here?

From the mean root distance map, it is possible to explain the Furnariides diversity gradient? If so, please explain from an evolutionary perspective.

Answer: Species-rich areas present low MRD values, and species-poor areas present high MRD values. MRD is a metric that measures deriveness, and high MRD means that assemblages are composed of recently evolved species. This can also be interpreted as these pixels/areas also present higher speciation rates. On the contrary, areas with low MRD will present low rates of speciation, and the higher species richness in these areas is a product of steady but low rates of speciation.

Save the figures

There are multiple options to save the figures. Jesús particularly like saving his figures in PDF. To save the figures in a pdf file, you can use the following code.

pdf(file = “figures/Figure_2_association_MRD_SR.pdf”, height = 5, width = 7)

data_sr_mrd %>% ggplot(aes(x = MRD, y = SR)) + geom_point(color = “darkgray”) + geom_smooth(method = “lm”) + labs(x = “Mean Root Distance”, y = “Species Richness”) + theme_classic()

dev.off()

This lines will save your figure in your working directory.

3.3.1 Precipitation and Temperature

Load the environmental variables that correspond to bio1 (Annual Mean Temperature) and bio12 (Annual Precipitation). These data correspond to two variables out of 19 from WorldClim (http://www.worldclim.org/current). We will use these two variables just for educational purposes, rather to make a complete evaluation of the species-environmental relationships.

Code
# Annual Mean Temperature
bio1 <- rast("data/bio1.bil")
bio1 <- bio1/10

# Annual Precipitation
bio12 <- rast("data/bio12.bil")
bio12
class       : SpatRaster 
dimensions  : 900, 2160, 1  (nrow, ncol, nlyr)
resolution  : 0.1666667, 0.1666667  (x, y)
extent      : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
source      : bio12.bil 
name        : bio12 
min value   :     0 
max value   :  9916 

Plot the environmental variables

Code
plot(bio1) 

Code
plot(bio12)

Ok, the bio1 and bio12 layers are at global scale, so now will need to crop them to the extent of the Neotropics.

Code
bio1_neo <- crop(bio1, ext(franges))
bio12_neo <- crop(bio12, ext(franges))
Code
par(mfrow = c(1, 2))

plot(bio1_neo, main = "Annual Mean Temperature", 
     col = rev(viridis::inferno(10)))

plot(bio12_neo, main = "Annual Precipitation", 
     col = rev(viridis::inferno(10)))

Much better!

Obtain the values from bio1, bio12, SR and MRD for each cell or pixel using the coordinates.

Code
# Get environmental data using coordinates from our maps
f_ras_bios <- extract(x = c(bio1_neo, bio12_neo), # environmental variables
                      y = data_sr_mrd[, c("x", "y")]) %>% # coordinates
  rename(MAT = bio1, MAP = bio12)

# Combine the information
fdata <- bind_cols(data_sr_mrd, f_ras_bios)

head(fdata)
          x        y SR MRD ID  MAT  MAP
1 -109.5817 27.51778  1  15  1 24.1  532
2 -108.5817 27.51778  1  15  2 19.6  799
3 -108.5817 26.51778  1  15  3 24.4  624
4 -107.5817 25.51778  1  15  4 23.3 1007
5 -106.5817 24.51778  1  15  5 23.0 1124
6 -106.5817 23.51778  1  15  6 24.8  753

Now make a simple correlation between the Furnariides richness and bio1 and bio12.

Species richness correlated with Temperature

Code
cor.test(fdata$SR, fdata$MAT)

    Pearson's product-moment correlation

data:  fdata$SR and fdata$MAT
t = 17.846, df = 1601, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3656740 0.4473788
sample estimates:
      cor 
0.4073411 

Species richness correlated with Precipitation

Code
cor.test(fdata$SR, fdata$MAP)

    Pearson's product-moment correlation

data:  fdata$SR and fdata$MAP
t = 29.339, df = 1601, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.5585221 0.6222577
sample estimates:
      cor 
0.5913125 

And also the linear model…

Code
lmbio1 <- lm(SR ~ MAT, data = fdata)

summary(lmbio1)

Call:
lm(formula = SR ~ MAT, data = fdata)

Residuals:
    Min      1Q  Median      3Q     Max 
-20.064  -6.906  -2.198   5.378  43.153 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.43243    0.86046   2.827  0.00476 ** 
MAT          0.68641    0.03846  17.846  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.744 on 1601 degrees of freedom
Multiple R-squared:  0.1659,    Adjusted R-squared:  0.1654 
F-statistic: 318.5 on 1 and 1601 DF,  p-value: < 2.2e-16
Code
lmbio12 <- lm(SR ~ MAP, data = fdata)

summary(lmbio12)

Call:
lm(formula = SR ~ MAP, data = fdata)

Residuals:
    Min      1Q  Median      3Q     Max 
-43.561  -4.867  -1.078   3.599  42.582 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 6.1166677  0.4334826   14.11   <2e-16 ***
MAP         0.0072354  0.0002466   29.34   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.604 on 1601 degrees of freedom
Multiple R-squared:  0.3497,    Adjusted R-squared:  0.3492 
F-statistic: 860.8 on 1 and 1601 DF,  p-value: < 2.2e-16

Which environmental variable is more related with Furnariides richness?

Answer: Both MAT and MAP are positively associated with the number of species. However, MAP shows a more steeper association than MAT. Also, MAP (R^2 = 0.43) explain more variance than MAT (R^2 = 0.24).

Please explain the relationship from an ecological perspective

Answer: The positive association between MAP and MAT with the number of species suggests that wetter and hotter areas contain more species. This might suggest that these areas present high local heterogeneity, allowing more species to coexist or to accumulate over time.

Code
fdata %>% 
  ggplot(aes(x = MAT, y = SR)) + 
  geom_point(color = "darkgray") + 
  geom_smooth(method = "lm") + 
  labs(x = "Mean Annual Temperature", 
       y = "Species Richness") + 
  theme_classic() + 
  theme(
    axis.text = element_text(size = 15, colour = "black"),
    axis.title = element_text(size = 18) 
    )
`geom_smooth()` using formula = 'y ~ x'

Code
fdata %>% 
  ggplot(aes(x = MAP, y = SR)) + 
  geom_point(color = "darkgray") + 
  geom_smooth(method = "lm") + 
  labs(x = "Annual Precipitation", 
       y = "Species Richness") + 
  theme_classic() + 
  theme(
    axis.text = element_text(size = 15, colour = "black"),
    axis.title = element_text(size = 18) 
    )
`geom_smooth()` using formula = 'y ~ x'

3.3.2 Precipitation and Temperature for Minnesota

US states information was obtained from here

Read spatial information

Code
US_states <- st_read("data/States_shapefile-shp/States_shapefile.shp") 
Reading layer `States_shapefile' from data source 
  `/Users/jpintole/Library/CloudStorage/Dropbox/Teaching/NOL_Itasca/NoL-2024/data/States_shapefile-shp/States_shapefile.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 51 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -178.2176 ymin: 18.92179 xmax: -66.96927 ymax: 71.40624
Geodetic CRS:  WGS 84
Code
glimpse(US_states)
Rows: 51
Columns: 7
$ FID        <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, …
$ Program    <chr> "PERMIT TRACKING", NA, "AZURITE", "PDS", NA, "ECOMAP", "SIM…
$ State_Code <chr> "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FL",…
$ State_Name <chr> "ALABAMA", "ALASKA", "ARIZONA", "ARKANSAS", "CALIFORNIA", "…
$ Flowing_St <chr> "F", "N", "F", "F", "N", "F", "F", "P", "P", "P", "N", "F",…
$ FID_1      <int> 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930,…
$ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-85.07007 3..., MULTIPOLYGON (…

Filter the state of Minnesota

Code
MN <- US_states %>% 
  filter(State_Name == "MINNESOTA")

Plot the United States and highlight Minnesota

Code
ggplot() +
  geom_sf(data = US_states, alpha = 0.5) +
  geom_sf(data = MN, aes(fill = State_Name), alpha = 0.7, size = 2) +
  scale_fill_scico_d(palette = "davos", direction = 1, 
                     end = 0.9) +
  theme_classic() + 
  theme(
    plot.background = element_rect(fill = "#f1f2f3"),
    panel.background = element_rect(fill = "lightblue"),
    panel.grid = element_blank(),
    line = element_blank(),
    rect = element_blank()
  )

Get precipitation and temperature for the state of Minnesota

Code
MN_mat <- crop(bio1, MN)

MN_ap <- crop(bio12, MN)

Plot Minnesota temperature and precipitation

Code
par(mfrow = c(1, 2))

plot(MN_mat, main = "Annual Mean Temperature", 
     col = rev(viridis::inferno(10)))
plot(as(MN, "Spatial"), add = TRUE, lwd = 2, border = "green")

plot(MN_ap, main = "Annual Precipitation", 
     col = rev(viridis::inferno(10)))
plot(as(MN, "Spatial"), add = TRUE, lwd = 2, border = "green")

The end! for now…